Quora Question Duplication
نویسندگان
چکیده
We explored two approaches based on Long Short-Term Memory (LSTM) networks on the Quora duplicate question dataset. The first model uses a Siamese architecture with the learned representations from a single LSTM running on both sentences. The second method uses two LSTMs with the two sentences in sequence, and the second attending on the first (word-by-word attention). Our best model achieved 79.5% F1 with 83.8% accuracy on the test set.
منابع مشابه
Duplicate Quora Questions Detection
Quora is a platform to ask questions and connect with people who contribute unique insights and quality answers. In this paper, we are mainly focusing on the duplicate questions detection. The main idea is to first vectorize questions and extract features, train and predict using machine learning techniques based on question vectors and features previously built. We implement two approaches to ...
متن کاملWho is Authoritative? Understanding Reputation Mechanisms in Quora
As social Q&A sites gain popularity, it is important to understand how users judge the authoritativeness of users and content, build reputation, and identify and promote high quality content. We conducted a study of emerging social Q&A site Quora. First, we describe user activity on Quora by analyzing data across 60 question topics and 3917 users. Then we provide a rich understanding of issues ...
متن کاملAnalysis and Prediction of Question Topic Popularity in Community Q&A Sites: A Case Study of Quora
In the past few years, Quora a community-driven social platform for question and answering, has grown exponentially from a small community of users into one of the largest and reliable source of Q&A on the Internet. Quora has a built-in social structure integrated to its backbone; users can follow each other, follow question, topics etc. Apart from the social connections that Quora provides, it...
متن کاملSiamese Neural Networks with Random Forest for detecting duplicate question pairs
Determining whether two given questions are semantically similar is a fairly challenging task given the different structures and forms that the questions can take. In this paper, we use Gated Recurrent Units(GRU) in combination with other highly used machine learning algorithms like Random Forest, Adaboost and SVM for the similarity prediction task on a dataset released by Quora, consisting of ...
متن کاملIdentifying Quora question pairs having the same intent
This paper presents a system which uses a combination of multiple text similarity measures of varying complexities to classify Quora question pairs as duplicate or different. The solution uses a support vector classifier model trained using the precomputed features ranging from longest common sub-string and sub sequences to word similarity based on lexical and semantic resources. The scope of t...
متن کامل